Document Collection Visual Question Answering
Abstract
Current tasks and methods in Document Understanding aim to process documents as single elements. However, documents are usually organized in collections (historical records, purchase invoices) that provide context useful for their interpretation. To address this problem, we introduce Document Collection Visual Question Answering (DocCVQA), a new dataset and related task, where questions are posed over a whole collection of document images and the goal is not only to provide the answer to the given question, but also to retrieve the set of documents that contain the information needed to infer the answer. Along with the dataset, we propose a new evaluation metric and baselines which provide further insights into the new dataset and task.
Similar resources
Some Experiments in Question Answering with a Disambiguated Document Collection
This paper describes our approach to the Question Answering Word Sense Disambiguation task. This task consists in carrying out Question Answering over a disambiguated document collection. In our approach, disambiguated documents are used to improve the accuracy of the retrieval phase. In order to do this, we added a WordNet-expanded index to the document collection. The expanded index contains ...
From Document Retrieval to Question Answering
Investigating Embedded Question Reuse in Question Answering
The investigation presented in this paper is a novel method in question answering (QA) that enables a QA system to gain performance through reuse of information in the answer to one question to answer another related question. Our analysis shows that a pair of questions in a general open domain QA can have an embedding relation through their mentions of noun phrase expressions. We present methods f...
Revisiting Visual Question Answering Baselines
Visual question answering (VQA) is an interesting learning setting for evaluating the abilities and shortcomings of current systems for image understanding. Many of the recently proposed VQA systems include attention or memory mechanisms designed to support “reasoning”. For multiple-choice VQA, nearly all of these systems train a multi-class classifier on image and question features to predict ...
iVQA: Inverse Visual Question Answering
In recent years, visual question answering (VQA) has become topical as a long-term goal to drive computer vision and multi-disciplinary AI research. The premise of VQA’s significance is that both the image and the textual question need to be well understood and mutually grounded in order to infer the correct answer. However, current VQA models perhaps ‘understand’ less than initially hoped, and in...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-86331-9_50